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CHAPTER  1 


INTRODUCTION 


In  a  variety  of  applications  it  is  desirable  to  determine  the  nat¬ 
ural  frequencies  or  poles  of  a  system  from  a  noisy  observation  of  its 
impulse  response.  A  direct  approach  of  minimizing  an  error  with  respect 
to  poles  and  residues  is  a  highly  nonlinear  regression  problem.  Except 
for  low  order  cases  this  nonlinear  regression  problem  is  difficult  to 
solve  directly. 

In  general,  it  is  eesier  to  estimate  the  system  characteristic 

equation  first,  and  then  find  its  roots  from  the  estimated  equation. 

Let  be  the  maximum  likelihood  estimate  of  x.  The  invariance 

property  of  the  maximum-likelihood  estimator  [1]  states  that  if  f(x)  is 
an  invertible  function  defined  for  all  x,  then  the  maximum  likelihood 
estimate  of  f  is  just  O'J*'  problem  if  the  estimate  of  the 

characteristic  equation  coefficients  is  of  maximum  likelihood,  then  the 
roots  of  that  equation  are  the  maximum  likelihood  estimate  of  the  system 
z-poles. 


The  pole  estimation  techniques  are  extended  to  the  SEM  (singularity 
expansion  method)  parameter  estimation.  The  SEM  formalism  began  from 
experimental  observations  concerning  the  transient  electromagnetic 


response  of  complicated  scatterers  such  as  missiles  and  aircraft 
[2,3,4].  Instead  of  analyzing  various  parameters  of  a  certain  scatterer 
analytically,  we  can  compute  SEM  parameters  directly  from  the  induced 
transient  current  data.  It  was  observed  that  damped  sinusoids  were 
dominant  features  of  typical  transient  responses  [2].  The  response  of  a 
scatterer  to  an  incident  impulse  plane  electromagnetic  wave  is  expressed 


as  a  sum  of  complex  exponentials.  This  sum  depends  on  a  few  param¬ 
eters,  namely  natural  frequencies  which  do  not  depend  on  an  observation 
location  and  an  incidence  direction,  coupling  coefficients  which  des¬ 
cribe  the  coupling  between  the  incident  field  and  the  scatterer,  and 
natural  modes  which  describe  the  spatial  amplitude  variation  in  the 
induced  current  [2,3,4]. 

In  Chapter  2,  we  review  the  covariance  method  and  the  singular 
value  decomposition  (SVO)  method.  Also  a  new  algorithm  is  described  to 
extract  the  reduced  characteristic  equation  from  the  weakest  eigen¬ 
vectors  when  the  system  order  is  overdetermined. 

In  Chapter  3,  the  iterative  preprocessing  algorithm  (IPA)  is  pre¬ 
sented.  This  is  related  to  the  Steigl itz-McBride  algorithm  [5,6].  But 
this  IPA  not  only  reduces  significantly  the  computational  burden  of  the 
existing  algorithm;  it  also  improves  the  stability.  The  approximate 
iterative  preprocessing  algorithm  (AIPA),  which  further  reduces  the  com¬ 
putational  burden,  is  discussed.  The  AIPA  for  the  pure  sinusoid  case  is 
related  to  Kay's  iterative  filtering  algorithm  (IFA)  [7]. 

In  Chapter  4,  the  Cramer-Rao  bound  for  the  system  transfer  function 
parameters  is  described.  This  is  a  very  important  tool  for  evaluating 
different  estimators. 

Some  simulation  results  for  the  new  SVD  method,  the  IPA,  and  the 
AIPA  are  given  in  Chapter  5.  The  sample  variances  of  the  coefficients 
using  the  IPA  lie  exactly  on  the  C-R  bound  curve  for  moderate  SNR.  This 
argues  that  the  IPA  converges  to  the  maximum  likelihood  estimator  for 
some  range  of  SNR  values. 

In  Chapter  6,  the  general  SEM  formalism  is  reviewed  and  a  new 
algorithm  is  described  for  estimating  natural  frequencies  given  multiple 
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observations.  Also  the  C-R  bound  is  calculated  for  the  characteristic 
equation  coefficients  for  given  multiple  data  sets. 

In  Chapter  7,  an  iterative  scheme  is  proposed  for  estimating  SEM 
parameters,  such  as  coupling  coefficients,  natural  modes,  and 
normalization  factors. 

Some  experimental  results  for  SEM  parameter  estimation  are  given  in 
Chapter  8. 


CHAPTER  2 
SVO  METHOD 


2.1  COVARIANCE  METHOD  AND  SVO  METHOD 

Suppose  that  there  are  N  samples  of  the  observed  data  sequence 
y(nT)  consisting  of  K  complex  exponentials  plus  a  zero-mean, 
uncorrelated  noise  e(nT)  such  that 


y(nT)  *  I  c^exp(sjnT)  +  e(nT)  for  n  =  0,  1,  N-1 

i=l  ^  ’ 


where  (s.)  are  the  poles,  (c. )  are  the  residues,  and  T  is  the 
time  increment  between  successive  samples.  Let 


where  the  z.  =  exp{s^.T)  are  the  poles  in  terms  of  the  z-trans- 
form  variable.  Then  with  no  loss  of  generality,  (1)  becomes 


n  n  n 


The  characteristic  equation  with  respect  to  T  is 


B(z)  =  bQ  +  b^z  ^  +  ...  +  ^ 


Once  we  obtain  an  estimate  of  B(z)  from  the  data  sequence  (y^), 
finding  the  roots  of  B(z)  =  0  is  an  algebraic  procedure.  The  inverse 


T 


▼ 


T 


«.*  WT  V  *.*  AT  >* 


s.  =  [loglz^  I  +  j{arg(z^)}]/T 


is  used  to  convert  z-poles  to  s-poles,  and  residues  can  be  found  by  a 
linear  least  squares  method.  Thus  the  main  problem  is  estimating  the 
coefficients  (b^). 

In  terms  of  the  difference  equation 


K 

I  b.w„  .  =  0  for  n  =  K,  K+1,  ....  N-1  (5) 

i=0 

K 

Define  d^  =  t)^e^_^  for  n  =  K,  K+1,  N-I. 


With  noise  contaminated  data,  (5)  becomes 


K 

I  i  *  1  =  X.  X+1,  ....  N-1  (6) 

i_0  I  n  I  n 


where  the  (dp)  are  due  to  noise  and  are  termed  the  equation  error.  In 
matrix  form  (6)  becomes 


f  *v 

^K+l 

• 

• 

• 

• 

• 

• 

• 

• 

‘’o 

• 

• 

^N-l 

4 

'^N-l 

Y  X  =  d  (7) 


}  .  • 


'."v 


I 

sv 


U 


By  letting  bg  =  1,  (7)  becomes 


Y|_x,  *  Yr  •  d 


where  is  the  matrix  consisting  of  the  first  K  columns  of  Y,  yj^  is  the 
(K+l)th  column  of  Y,  and  Xj  =  [bj^,  \-l*  *•*»  1  minimize  d  d  of 

(7),  we  need  to  solve 


Yl\xi  =  .Yl*> 


This  approach  to  estimating  the  (b^  iS  called  the  covariance  method 
[8].  It  has  been  observed  that  this  method  has  a  statistical  bias  in 
the  estimates  [9].  The  least  squares  process  converges  as  N  becomes 
large,  but  it  yields  biased  parameter  values  which  are  a  function  of  the 
standard  deviation  of  the  noise.  It  is  unfortunate  that  even  for  small 
noise  levels,  this  bias  will  produce  large  errors  on  the  estimate  s^. 

Kay  [10]  examined  a  similar  problem  in  autoregressive  spectral  estima¬ 
tion. 

The  singular  value  decomposition  (SVD)  approach  deals  directly  with 
(7).  The  SVD  is  used  to  solve  the  homogeneous  equation,  i.e.  to  find 
the  null  space  of  the  Y  matrix.  The  SVD  of  Y  is 


Y  =  USV 


where  S  * 


V 

•-'■'■.V 


0 
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Here  (a.)  are  termed  the  singular  values,  which  are  the  square  roots 
of  the  eigenvalues  of  Y*Y,  and  i  02  ^  i  ^  ^  ^re  unitary 

matrices  of  orders  (N-K)x(N-K)  and  (K+l)x(K+l)  respectively.  Let  S'  be 
the  resulting  matrix  obtained  by  forcing  the  smallest  singular  value 
®K+1  *  ^  matrix  which  is  constructed  in  the  same 

manner  as  Y  with  noise-free  data.  The  matrix  Y'  *  US'V*  turns  out  to 
be  the  closest  matrix  of  rank  K  to  Y  in  the  sense  of  Frobenius  norm 
[11].  Because  the  rank  of  the  matrix  W  is  K,  it  is  reasonable  that  the 
weakest  eigenvector  of  Y*Y  is  an  estimate  of  coefficient  vector  x. 

But  what  statistical  properties  do  the  eigenvectors  have?  In  particular 
can  we  say  that  the  weakest  eigenvector  is  an  unbiased  estimator  of  x? 

It  was  pointed  out  in  [12]  that 


E[Y*Y]  =  W*W  +  I(N-K)a2 


(11) 


where  E  is  an  expectation  operator,  I  is  the  identity  matrix,  and  is 
a  variance.  The  eigenvalue  decomposition  of  (11)  is 


E[Y*Y]  =  V[D  +  I(N-K)a2]V' 


(12) 


where  D  is  the  diagonal  matrix  whose  elements  are  the  eigenvalues  of 
W*W  and  V  is  the  modal  matrix  of  W*W.  Note  that  the  eigenvectors  of 
W*W  are  preserved  and  each  eigenvalue  is  increased  by  (N-K)a^.  This 
property  can  be  used  for  order  selection.  But  (12)  itself  cannot  be  a 
proof  of  the  supposition  that  the  expectation  of  the  eigenvectors  is 
unaffected  by  noise,  i.e.  unbiased,  because  the  eigenvectors  of  E[Y*Y] 
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and  the  average  of  eigenvectors  of  Y*Y  will  be  different.  In  fact  a 
small  perturbation  on  W*W  might  give  a  large  error  of  eigenvector 
estimation,  especially  when  some  eigenvalues  are  clustered  together 
[13].  At  present  no  proof  is  available  to  support  the  claim  that  the 
expectation  of  the  eigenvectors  is  unaffected  by  noise.  However,  some 
simulation  results  show  that  pole  estimates  using  SVD-based  methods  are 
much  less  biased  than  those  using  covariance  method  [14,15]. 

2.2  DEFLATION  ALGORITHM 

So  far  the  matrix  Y  was  taken  to  be  (N-K)x(K+l).  But  the  true  (or 
appropriate)  order  of  a  system  is  usually  unknown  a  priori.  Construct¬ 
ing  a  data  matrix  as  (N-M)x(M+l)  and  solving  it  to  obtain  an  estimate 
results  in  M-K  extraneous  poles.  It  has  been  observed  empirically  that 
the  presence  of  extraneous  poles  seems  to  protect  the  true  poles  against 
noise-induced  perturbation  [14,15].  Unfortunately  it  is  not  known  how 
to  select  an  optimum  M  given  N  and  K  without  examining  the  error  for 
jach  M. 

If  we  overestimate  the  order,  then  the  SVD  of  W  (of  order  (N-M)  x 
(M+1))  becomes 


W  =  USV* 


where  S  = 


(13) 


•.v;. 


r 
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Here  >  <^2  >  ...  1  *  Vl  =  '’K4-2  *  “  Vl  =  °  ^ 

unitary  matrices  of  the  appropriate  orders.  The  last  M-K+1  columns  of  V 

are  in  the  null  space  of  W.  Let 


^  *  l^Vr  V+2’  ***’  ''m+J 


(14) 


where  is  the  ith  column  of  V. 

The  columns  of  H  span  the  null  space  of  W.  Define  an  (M+l)x{M-K+l)  p 

r  ^ 

mat  ri  x  ■>' 


It  is  shown  in  appendix  A  that  every  column  of  G  is  in  the  null  space  of 
W  so  that  columns  of  G  are  the  linear  combinations  of  columns  of  H  and 
vice  versa.  By  introducing  (M-K+1 )x(M-K+l)  matrix  R,  G  and  H  are 

•A‘v 

related  as  follows.  -  r 


G  =  HR 


(16) 


Given  the  SVD  of  Y,  we  can  examine  the  a. 's.  If  there  is  a  signifi¬ 
cant  drop  between  successive  values  of  and  and  if  the 

singular  values  after  ont+l  relatively  small  and  there  are  no  big 
changes  between  successive  singular  values,  then  the  order  can  be  selec- 

A 

ted  as  K  =  m.  Now  any  column  vector  of  H  is  a  candidate  for  a  coeffic¬ 
ient  vector.  The  problem  is  how  to  estimate  G  efficiently,  given  H. 

The  essence  of  Henderson's  deflation  algorithm  [16]  is  as  follows. 

1)  Perform  a  forward  Gauss  elimination  with  partial  pivoting  to 
reduce  to  an  upper  trapezoidal  matrix,  leaving  a 
triangular  pattern  of  zeros  in  the  lower  left  corner. 

2)  Perform  a  backward  Gauss  elimination  without  pivoting  to  reduce 
a  triangular  pattern  of  zeros  in  the  upper  right  while 
preserving  those  in  the  lower  left. 

This  resulting  matrix  was  suggested  [16]  as  an  approximate  G^ 
except  for  row  scaling.  There  is  a  problem  with  this  algorithm. 

Because  of  the  finite  word  length  of  a  digital  computer,  the  use  of  a 
backward  Gauss  elimination  without  pivoting  may  result  a  large  error. 
This  error  may  emphasize  certain  directions  which  are  not  dominant 
ori gi nally. 

We  suggest  an  algorithm  that  overcomes  this  difficulty.  Let  g^*”^ 

(m) 


be  the  mth  column  of  G  and  r 
Equation  (16) 


be  the  mth  column  of  R,  then  from 


By  partitioning,  (17)  is  rewritten  as 


* 

0 

• 

0 

p(m)  ,, 

^2 

0 

«3 

• 

0 

•  t 

(18) 


Because  the  (b^ )  are  unknown,  r^*"^  can  be  estimated  such  that 
satisfies  the  following  set  of  M-K  homogeneous  equations. 


(19) 


Intentionally,  (19)  is  changed  to  the  set  of  inhomogeneous  equations 
such  that 


’«il' 

*^1R 

(ml 

where  Pj  is  the  vector  consisting  of  the  first  M-K  elements  of  r^  , 
*^1L  matrix  consisting  of  the  first  M-K  columns  of  (H^) 

and  h^i^  column  of  Hj(H2).  Gauss  elimination  with 

(possibly)  complete  pivoting  is  used  to  solve  Equation  (20)  for  each  m 
to  get  R.  Now 

HR  =  G  (21 

A 

G  in  (21)  is  intended  to  approximate  G  except  for  column  scaling.  From 
G,  a  (K+l)x(M-K+l)  matrix  Q  is  obtained  by  eliminating  the  zeros  and 
shifting  every  element  of  the  kth  column  of  G  upward  (k-l)  times  for 
k  =  1,  ....  M-K+1.  Each  column  of  Q  should  be  an  estimate  of  the 
coefficient  vector.  Taking  an  average  of  (M-K+1)  columns  of  Q  after 
normalizing  each  column  with  respect  to  the  last  element  of  that  column 
is  the  final  reduced  estimate  of  the  true  coefficient  vector.  Gauss 
elimination  with  complete  pivoting  to  estimate  each  r^"^'  should  cure 
the  problem  of  emphasizing  a  particular  direction  with  Henderson's 
deflation  algorithm.  And  averaging  normalized  columns  of  Q  should 
alleviate  the  problem  further.  Simulation  results  for  the  examples 
chosen  for  Experiment  I  in  Chapter  5  show  that  sample  variances  for 
coefficients  with  the  new  algorithm  were  always  smaller  than  those  with 
Henderson's  deflation  algorithm  for  K  =  2  case.  Also  Experiment  II  in 
Chapter  5  shows  that  the  bias  with  the  new  algorithm  was  relatively 
smaller  than  that  with  the  existing  method. 


CHAPTER  3 


ITERATIVE  PREPROCESSING  ALGORITHM 


The  problem  of  estimating  poles  can  be  posed  in  terms  of  a 
time-domain  system  identification  problem.  Suppose  the  unknown  system 
has  a  K-th  order  transfer  function  given  by 


ao  +  a^z 


-(K-1) 


bQ  +  bjz 


(22) 


+  w^z 


where  =  1  for  convenience  and  (w^)  are  samples  of  the  impulse 
response.  This  can  be  written  in  terms  of  the  difference  equation 


I  h.W  .  =  a  ,  n  *  0,  1,  ...,  K-1 

i=0  '  "  '  (23) 

K 

.1  bW  i  “ 

i=0 


or  in  matrix  form  (with  N  samples  of  the  impulse  response) 
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(24)  can  be  rewritten  as 


b,  1 


bi  1 


We  observe  y_  *  w  +  e„  for  n  =  0,  I,  N-1,  where  (e  )  are  samples 
n  n  n  n 

of  uncorrelated,  zero-mean  noise.  In  vector  form 


y  =  w  +  e 


where  y  =  [yg** •  • ’yN-l  ®  y  '*® 

estimate  the  denominator  coefficients  in  (22).  Let  Y  be  the  matrix 
which  is  constructed  in  the  same  manner  as  W  with  (y^)  as  elements. 
Then  from  (24)  we  have 


Yx  =  +  d 

0 


where  d  *  (Y  -  W)x  *  Be  is  called  the  equation  error  vector.  Now  (27) 
can  be  rewritten  as 


••  v-'S’S  S'-.v 
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Be  =  Yx  - 


Because  Matrix  B  is  lower  triangular  and  has  I's  as  diagonal  elements, 
it  is  nonsingular  and  invertible.  The  inverse  of  matrix  B  is  easily 
obtained  by  forward  substitution. 


F  =  B 


^N-2  ^N-3  . 

Vl  ^N-2  . 


fg  f^  1 


where  f^  = 


and  f .  = 


K 

i=l  ^  J  ’ 


1  <  j  <  K 


K+1  <  j  . 


Note  that  1/B(z)  =  1  +  I  fiZ"'  . 

i=l 


By  introducing  the  matrix  F,  (28)  becomes 


e  =  F[Yx  - 


]  =  FYx  -  F^a 


S 


where  is  the  matrix  consisting  of  first  K  columns  of  F.  Let  be 
the  matrix  consisting  of  first  K  columns  of  Y  and  be  the  K+1^^  column 

of  Y.  Note  that  y  =  yj^.  Define  Xj  =  [bj^,  b|^_^ . (9) 

becomes 


e  =  FY^Xj  +  Fy  -  F^_a 


Because  F  (and  so  F^^)  is  a  function  of  (b^.  )*  minimizing  a  sum  of  squared 

it 

error  J  =  e  e  is  unfortunately  a  highly  nonlinear  problem.  But  fixing  F 
as  a  constant  matrix  and  setting  the  derivatives  of  J  with  respect  to  x^ 
and  a,  we  get  two  conditions: 


Fl  a  -  Fl  FY^  Xj  =  Fl  Fy 


-  (FYJ  F^  a  +  (FY^)  FYl  Xj  =  -(FY^_)  Fy 


For  a  fixed  F,  the  problem  is  linear  in  a  and  Xj .  An  iterative  pro¬ 
cedure  can  be  used  to  minimize  J.  The  first  estimate  of  a^^^  and 
results  from  taking  F^^^  as  the  identity  matrix.  With  this  we  can 

construct  F^^^  to  get  a  new  estimate  a^^^  and  and  so  forth.  At 

each  iteration,  the  previous  denominator  coefficients  are  used  to  get 
new  estimates,  so  that  a^'^^and  are  found  by  solving  a  system  of 

linear  equations 

•  ”  ^  r  ^  ^  ' 

(;p(ni-l))*p(m-l)  .  j-p(m-l)  ^*p(m-l);^  ^(m) 


r.'  -D:  i*c(ni-l)  rc(m-l):  i*c(m-l):  I  (m) 


^p(m-l)Y  ^*p(m-l)y 


If  convergence  is  obtained,  J  =  e  e  is  minimized,  which  is  the  error 
we  want  to  minimize.  But  if  the  SNR  is  very  low,  it  is  hard  to  expect 
convergence  in  general.  One  can  check  that  (34)  is  an  explicit  form  of 
the  Stei glitz-McBride  iterative  algorithm  with  an  impulse  input  [5,6]. 
This  method  has  been  widely  used  and  the  convergence  and  accuracy  prop¬ 
erties  for  large  data  lengths  and/or  high  SNR  assumptions  are  well-known 
[17].  At  each  iteration,  it  is  necessary  to  solve  a  system  of  2K  linear 
equations. 

3.1  ITERATIVE  PREPROCESSING  ALGORITHM 

•k 

Because  the  rank  of  the  matrix  is  K,  the  matrix  F^^  F^^  is  the 
hermitian  positive  definite  and  invertible.  From  (32),  we  get 

Substituting  (35)  into  (33)  we  obtain 


(FYJ*  q  FY^  Xj  =  -(Fy’j*  Q  Fy  (36) 

where  Q  =  I  -  Fl(I"l*''l 


With  a  fixed  F,  this  equation  is  linear  in  Xj.  An  iterative  procedure 
analogous  to  that  mentioned  above  is  used  to  solve  (36).  The  first 
estimate  of  x^^^^  results  from  setting  F^*^^  =  I.  With  this  we  can 

construct  F^^^  and  calculate  9®^  estimate 


and  so  forth 


^p(^”1)y  Q  ('^“1  )  Y  j  )  Q  )p  )y 


where  =  I  -  p^{m-l)^p^(ni-l)*p^{ni-l)  ^-1  ^P^{m-1) 


for  m  =  1,  2,  ....  and  =  I  . 

At  each  iteration  we  need  to  solve  a  system  of  only  K  linear  equations 
which  is  an  advantage  compared  to  the  S-M  algorithm.  If  convergence  is 
obtained,  then  again  the  mean-square  error  is  minimized. 

Before  proceeding  further  let  us  introduce  the  Evans-Fischl 
algorithm  [18].  Partitioning  the  matrices  in  (28),  we  get 


De  =  Yx  =  d 


where  0  is  the  matrix  of  the  last  N-K  rows  of  B,  and  Y  and  d  are  as  in 
(7).  The  error  we  would  like  to  minimize  is  e*e.  Because  the  matrix 
D  is  (N-K)xN,  we  cannot  solve  for  e  directly.  But  using  the  SVD 

D  =  USV*  (  00*  =  U(SS*)U*  )  , 

+  * 

Thus,  e  =  VS  U  d  which  is  the  minimum  norm  solution  so  that 
min  e  e  =  min  d  U(SS  )~^U  d  =  min  d  (00  )"^d 

i.e.. 


it  it  if 

min  e  e  =  min  (Y^Xj  +  y^)  (00  +  y^) 
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where  and  are  defined  as  in  (8).  With  a  fixed  DO  ,  the  minimum  of 
e*e  would  be  given  by  the  solution  for  from  the  normal  equation 

(DD*)'^Yl  Xj  =  -Y^_*(DD*)'\  (40) 

Since  D  does  in  fact  depend  on  x^,  we  can  use  (40)  but  with  itera¬ 
tions.  It  is  shown  in  Appendix  8  that  the  iterative  preprocessing 
algorithm,  i.e..  Equation  (37)  is  basically  the  same  as  the  Evans-Fischl 
algorithm.  At  each  iteration  an  (N-K)x(N-K)  matrix  inversion  is 
required  in  the  E-F  algorithm,  but  inversion  of  KxK  matrix  is  required 
for  the  IPA.  Readers  might  notice  that  the  matrix  multiplication  FY  is 
nothing  but  an  inverse  filtering  and  it  requires  only  about  NK-K^/2  mul¬ 
tiplications.  When  N  is  very  large,  the  IPA  which  avoids  inversion  of  a 
large  matrix  ((N-K)x(N-K) )  should  be  more  stable  than  the  E-F  algor¬ 
ithm.  For  large  N,  about  (N^  +  N^K)/6  multiplications  except  for 
solving  a  system  of  K  linear  equations  are  required  at  each  iteration  of 
the  E-F  algorithm.  The  Equation  (36)  is  rewritten  as 

[(Fy')*(FY)  -  (Fl*FY)*(Fl*Fl)’^F^*FY)]  X  =  0. 

As  mentioned  before  NK-K^/2  multiplications  are  required  to  compute  each 
of  F  and  FY.  To  obtain  (FY)  (FY),  NK^/2  multiplications  are  required. 
Because  of  the  special  forms  of  Fj^  and  FY,  only  2NK  is  required  to  com- 

if 

pute  (F|_  FY).  Thus,  including  an  inversion  of  (F^^  F^^),  about  (3K^/2  + 
4NK  +  NK^/2)  multiplications  except  solving  a  system  of  K  linear  equa¬ 
tions  are  required  at  each  iteration  of  the  IPA  which  is  a  great  advan¬ 
tage  compared  to  the  E-FA  especially  when  N  »  K. 
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3.2  APPROXIMATE  ITERATIVE  PREPROCESSING  ALGORITHM 

The  disadvantage  of  the  IPA  is  that  a  KxK  matrix  inversion  is 
required  at  each  iteration.  In  at  least  two  cases  of  interest  we  can 
avoid  such  a  matrix  inversion. 

3.2.1  Damped  Sinusoid  Case 

If  the  signal  consists  of  damped  sinusoids,  i.e.  each  |z^|  <  1  for 
i  =  1,  2,  ....  K,  then  from  a  practical  viewpoint  the  signal  has  finite 
duration.  This  observation  can  be  used  to  simplify  the  calculation  of 
(Fl*Fl]”^.  As  a  result  both  the  computational  load  and  the  stability 
of  the  IPA  are  improved.  Suppose  the  B  matrix  is  partitioned  as 


*  _i 

Let  R  =  lim  (F^^  Fj^].  It  is  shown  in  Appendix  C  that  R'  relates  to  the 
N-h» 

matrix  B  very  simply  as 

R'^  =  BqBq*-  Bi*Bi  (42) 

Approximating  (F^  finite  N  by  R”^,  (36)  becomes 

(PYJ*  Q  Fy"l  Xj  =  -(FYJ*  Q  Fy 
where  Q  =  I  -  FlR"^Fl* 


(43) 


Convergence  and  accuracy  properties  are  not  available  yet,  but 
simulation  results  for  the  examples  chosen  in  Chapter  5  show  that  this 
approximate  iterative  preprocessing  algorithm  (AIPA)  converges  to  the 
maximum-likelihood  estimator  for  moderate  SNR. 

3,2.2  Pure  Sinusoid  Case 

*  N-1 

It  is  shown  in  Appendix  D  that  lim  F|^)  =  0  when  the  signal 

N-h» 

consists  of  pure  sinusoids,  that  is  when  the  roots  of  B(z)  =0  are  on 
the  unit  circle.  Because 

lim  a  =  lim  (FL*FLrM''L**'Yj^Xj  -  F^^]  =  0  , 

N-hb  Nx» 


(33)  becomes 

(FYJVl  Xj  =  -(FYj*Fy  (44) 

These  equations  are  the  same  as  Kay's  IFA  (iterative  filtering  algor¬ 
ithm)  [7].  Our  derivation  emphasize  that  (44)  is  an  approximation  which 
is  good  when  N  -►  »  .  We  show  in  Appendix  B  that  (36)  can  be  rewritten 
as 


^^2^L^  Q  F2YL  ®  ^  ^2^R 

where  Q  *  I  -  Fj^(F^^ 

Here  Yl  and  Yr  are  defined  as  in  (8)  and 


’Fq  I 

0 

}k 

F.  = 

■'0  ' 

^  i 

•=2 

^  N-K 

\ 

K  N-K 
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For  large  N  we  can  use 


=  0  to  approximate  (45)  as 


(F2Yl)*(F2Yl)  Xj  =  -(F2\)%^r  (^6) 

Equation  (46)  is  the  same  as  (44)  when  N  =  »  .  Note  that  the  matrix 
multiplication  FgY  is  different  from  the  ntatrix  multiplication  FY,  be¬ 
cause  the  former  filters  the  columns  of  Y  separately  and  the  latter 
filters  the  data  (yg,  y^,  ....  y^_j). 

Equation  (30)  for  the  error  is  rewritten  as 


e  =  Fl(YqX  -  a)  +  F2YX  (47) 

where  Yq  is  the  matrix  of  the  first  K  rows  of  Y,  To  minimize  e  e  given 
a  =  YqX  with  a  fixed  F2,  we  need  to  solve  the  set  of  normal  equations 
which  is  identical  to  (46)  with  iterations.  Both  Equations  (44)  and 
(46)  are  good  when  N  =  «  .  Equation  (44)  comes  from  setting  a  =  0. 

This  corresponds  to  setting  the  initial  conditions  in  the  difference 
equation  to  zero.  If  we  use  Equation  (44)  to  estimate  frequencies  when 
N  is  small,  a  =  0  is  not  appropriate.  This  explains  why  Kay  [7]  got  a 
large  sample  variance  for  the  frequency  estimates  with  large  SNR.  On 
the  other  hand  Equation  (46)  assumes  a  =  Y^x,  i.e.  sets  initial  condi¬ 
tions  to  yQ.y^ . ^K-1’  reasonable  especially  when  N  is 

small.  Because  (46)  filters  the  columns  of  Y  separately,  it  requires  an 
additional  (N-K)»(K-1)^  multiplications  compared  to  Kay's  method. 


CHAPTER  4 


CRAMER-RAO  BOUND  FOR  COEFFICIENT  ESTIMATION 

The  previous  chapter  described  an  algorithm  for  estimating  natural 
frequencies.  We  do  not  know  how  to  describe  generally  the  convergence 
property  of  the  IPA.  Simulation  results  using  the  IPA  can  be  compared 
with  simulation  results  using  other  methods.  But  is  is  more  interesting 
to  compare  with  the  Cramer-Rao  lower  bound. 

In  this  chapter  we  shall  describe  a  calculation  of  the  C-R  bound 
for  the  coefficients  of  the  characteristic  equation.  The  maximum-like¬ 
lihood  estimate  is  an  unbiased  estimate  whose  conditional  variance  sat¬ 
isfies  the  C-R  bound  with  equality.  As  mentioned  in  the  introduction  if 
the  IPA  converges  to  the  ML  estimate  for  the  characteristic  equation 
coefficients  then  it  is  maximum-likelihood  for  the,  natural  frequencies 
as  well.  The  C-R  bound  is  stated  as  follows  [1].  Let  y  be  an  observa¬ 
tion  vector  and  0  be  any  unbiased  estimate  of  a  vector  e,  then  the  con¬ 
ditional  variance  is  bounded  by 

var(0.l9.)>  (48) 

where  the  matrix  J  is 

J  {  I0  P.(yl6)}^]  (49) 

and  p(yl0)  is  the  probability  density  function  for  the  observation  given 
the  unknown  parameter  vector.  In  our  case,  the  model  is  either 


ss 


Rewrite  Equation  (50)  as 


^  \-l 

0 


Vk-1  ”n-k 


w  =  -  '.^L 


We  claim  that 


In  «»  -  FW 

3  X  ^  ^ 


Let  =  fnth  row  of  the  identity  matrix, 

=  mth  row  of  the  matrix  F,  and 
=  mth  column  of  the  matrix 
Then  from  (29)  it  is  easy  to  show  that 


And  It  IS  obvious  that 


l(m)y(j)  .  I  Vj-(K+2) 


for  ni+j^K+2 
for  iiH-j<K+2 


We  will  use  induction  to  show  that  (56)  holds.  From  (55),  it  is  clear 


3w.  3w 

^  ■  "0  ^  .  <  1 


for  i  =  1,  2,  ....  K  . 


Thus,  Equation  (59)  satisfies  Equation  (56).  Suppose  Equation  (56) 
holds  for  m  =  0,  1,  ...,  M,  i.e. 


3w_ 

3E“ 


for  1  “  I9  2}  •••9  K  dnd  m  ■  Oy  ly  •••y  M* 

Then  for  m  =  M  1 , 


3  r  T  h  W  1 

■5E-- 


Vl-1  -  '>j  sb-, 


*  T  U  F(M*2-j):  (K-Hl) 

Vi-i  °y  "l 


by  (60) 


A-\  fiJil]  G(e)  -  (y-w) 


because  E[(y-w)]  =  0  and  =  G(8) 


h  [le  P(yle)]  | 

=  —  G 

^(9)G(9) 

1 

■7^ 

(fwJ^fw'l  -  (FWl)^Fl 

(6 


Final ly , 
J-1 


(6 


What  we  are  interested  in  is  the  C-R  bound  for  the  denominator 
coefficients  8^,  62,  ....  9,^.  It  is  easy  to  show  that 

so  that 


var(9.-0. )  1  p. . 

where  9^.  =  for  i  =  l,2, 

Note  that  the  C-R  bound  is  related  to  the  noise  free  data  and  the  coef 
ficients  of  the  characteristic  equation.  The  matrix  P"^  is  the 
matrix  of  the  normal  equation  which  is  solved  at  each  iteration  of  the 
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CHAPTER  5 

SIMULATION  RESULTS 

In  this  chapter  we  give  two  kinds  of  experimental  results  to 
support  the  algorithms  proposed  in  previous  chapters. 

5.1  EXPERIMENT  I 

First  we  consider  two  data  sets  consisting  of  a  pair  of  complex 
conjugate  damped  sinusoids  in  white  Gaussian  noise. 

^n  "  ®n  n  =  0,  1,  ....  N-1 

The  first  data  set  has  =  1  and  «  -0.8  +  jO.5  so  that 


2  +  l.fiz 

H  (z)  =  - ^ - - 

I  +  1.6z  ^  +  0.89Z 


SNR  is  defined  as  SNR  =  20* log^Q {2/a)  where  a  is  the  standard  deviation 
of  the  additive  noise  ep.  Note  that  this  is  peak  SNR.  C-R  bounds  for 
bi  =  1.6  and  hz  =  0.89  were  calculated  by  the  method  suggested  in 
Chapter  4.  The  variances  with  respect  to  the  true  coefficients  were 
calculated  after  500  trials  at  different  SNR's  for  each  method.  The 
data  record  length  N  was  30  so  that  the  AIPA  is  very  close  to  the  IPA, 

A  21x10  data  matrix  was  used  for  the  SVD  method.  As  shown  in  Figure  1, 
both  the  AIPA  and  the  S-M  algorithm  performed  as  maximum  likelihood 
estimators  up  to  12  dB  of  SNR.  But  when  the  SNR  was  10.5  dB,  the  AIPA 
definitely  outperformed  the  S-M  algorithm.  This  indicates  that  the  AIPA 
was  more  stable  than  the  S-M  algorithm.  For  the  SVD  method,  eigenvec¬ 
tors  corresponding  to  the  8  weakest  eigenvalues  were  used  to  estimate 
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Figure  1.  (a),(b)  Comparison  of  pole  estimators 

(bi  =  1.6,  bo  =  0.89) 


the  coefficient  vector  as  described  in  Chapter  2.  The  sample  variance 
of  coefficients  obtained  by  the  SVD  method  was  about  the  double  the  C-R 
bound. 

As  a  second  example,  =  1  and  =  -0.7  +  jO.5  were  used. 

The  SNR  is  defined  as  before  and  N  =  20.  Again  500  trials  were  per¬ 
formed  at  different  SNR  for  each  method. 


H2(z)  = 


2  +  1.4z' 


1  +  1.4z‘^  +  0.74z'^ 


As  shown  in  Figure  2,  results  from  both  the  AIPA  and  S-M  algorithm  are 
poor  below  SNR  =  17  dB.  But  still  we  can  see  that  the  AIPA  performs 
better  than  S-M  algorithm  for  low  SNR.  A  14x7  data  matrix  was  used  for 
the  SVn  method.  Again  the  sample  variances  for  the  coefficients 
obtained  by  the  SVD  method  was  about  the  double  the  C-R  bounds.  It 
should  be  noted  that  the  new  SVD  method  that  combines  some  weakest 
eigenvectors  always  performed  better  than  Henderson's  deflation  algor¬ 
ithm.  Because  the  difference  was  about  1-2  dB  for  both  examples,  we  do 
not  show  that  on  the  graph.  For  higher  order  cases,  we  would  expect  to 
see  a  bigger  difference.  As  mentioned  in  the  introduction,  if  the  esti' 
mator  is  the  maximum-likelihood  estimator  for  the  coefficient  vector 
then  pole  estimates  via  this  estimator  are  also  maximum-likelihood. 

This  notion  is  demonstrated  in  the  next  examples. 

5.2  EXPERIMENT  II 

In  this  experiment  we  used  three  data  sets  consisting  of  complex 
damped  sinusoids  in  white  Gaussian  noise. 


Figure  2.  (a),(b)  Comparison  of  pole  estimators. 
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K 

y  =  ^  exp(s.n)  +  e  n  =  0,  1 . 24 

"  i=l  ^  " 

where  s.  =  +  j2nf. 

These  three  data  sets  were  chosen  by  Kumaresan  and  Tufts  [14].  The 
sequence  en  is  white,  and  complex  Gaussian,  with  variance  2a^.  SNR  is 
10»logio(l/2<J^) .  The  IPA  was  used  to  estimate  the  coefficient  vector 
and  then  the  pole  damping  factors  (a-j)  and  pole  frequencies  (f^) 
were  calculated  for  each  trial.  The  SVD  method  described  in  Chapter  2 
was  used  for  the  K  =  1  case.  The  C-R  bounds  as  well  as  the  backward 
covariance  method  using  the  SVD  .results  have  been  extracted  from  [14]. 
For  K  =  1,  fi  =  0,52  was  used.  As  shown  in  Figure  3,  for  =  0.1,  the 
IPA  indeed  performed  as  a  maximum-likelihood  estimator  up  to  10  dB  of 
SNR.  The  new  SVD  method  performed  as  well  as  K-T's  method.  An  18x8 
data  matrix  was  used  for  the  new  SVD  method  compared  to  7x18  data  matrix 
for  K-T's  method  which  means  computing  time  for  the  new  SVD  method  is 
much  less.  Note  that  the  sample  variance  of  the  estimates  in  this  one 
pole  case  show  broad  minimum  somewhere  between  M  =  6  to  10  for  the  new 
SVD  method.  For  the  results  in  Figure  4,  =  0.2  was  used.  Again  the 

IPA  performed  as  a  maximum-likelihood  estimator  up  to  15  dB  of  SNR.  The 
sample  variance  using  the  new  SVD  method  was  larger  than  that  for  K-T's 
method.  But  as  Table  1  shows,  the  bias  for  is  smaller  with  the  new 
SVD  method.  Averaging  reduced  coefficient  vectors  as  mentioned  in  Chap¬ 
ter  2  might  have  reduced  the  bias. 
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For  K  =  2,  we  set  =  0.1,  -  0.2,  =  0.52,  and 

=  0.42.  Figure  5  shows  that  IPA  performs  well  up  to 
11  dB  of  SNR. 


10  log  (l/var) 


10  log  (l/var) 


(oti  “  0,1,  cxp 


=  0.-?,  f.  =  0.52,  f,  =  0.42) 


CHAPTER  6 


APPLICATION  OF  THE  I PA  TO  THE  SEM 

The  singularity  expansion  method  (SEM)  form  of  the  surface  current 
density  on  a  finite-dimension,  perfectly  conducting  object  in  free  space 
is  written  as  [2,3,4] 

^s(rs.t) 

+  other  SEM  terms  +  noise 

where 

tQ  turn-on  time 

Jc  (K)  -  natural  mode  (appropriately  normalized) 

S|^  =  natural  frequency 

^^(max)  _  normalization  factor 

r\|^  =  coupling  coefficient  (appropriately  normalized) 

=  direction  of  incidence 
3p  =  polarization  vector 

fp(s)  =  Laplace  transform  of  incident  waveform  fp(t) 

Eq  =  scaling  constant  for  incident  wave  (in  V/m) 

Tg  =  coordinate  on  the  surface  S  of  the  object. 

The  problem  of  computing  aircraft  SEM  parameters  has  been  considered  by 
several  investigators.  Because  of  the  complexity  of  aircraft  surface 
geometry,  it  is  difficult  to  formulate  and  solve  the  scattering  problem 


analytically  for  an  exact  model.  When  an  electromagnetic  pulse  (EMP) 
strikes  an  aircraft,  it  induces  a  surface  current.  For  EMP  testing, 
several  sensors  are  placed  at  different  locations  of  the  aircraft  sur¬ 
face  to  measure  this  current.  Also  the  angle  between  the  incidence  dir¬ 
ection  and  an  aircraft  is  changed  to  obtain  multiple  EMP  data  sets  for 
several  directions  of  incidence  at  each  observation  location.  The 
source  of  noise  is  mostly  from  nonlinear  distortion  and  quantization  in 
the  instrumentation  system.  A  typical  number  of  samples  for  each  data 
set  is  512  with  a  sampling  interval  of  5  nsec.  It  is  desired  to  esti¬ 
mate  SEM  parameters  of  Equation  (68)  given  noisy  multiple  observations. 

As  a  special  case  of  the  general  Equation  (68),  we  consider  a 
simple  thin  wire  scatterer.  Suppose  the  wire  is  struck  by  a  plane 
electromagnetic  wave.  The  direction  of  propagation  d"  of  the  incident 
field  makes  an  angle  6  with  the  normal  to  the  center  of  the  scatterer  as 
shown  in  Figure  6.  The  induced  current  on  the  scatterer  for  any  e  and 
0£ii£L  is  represented  as  [19]. 

J(t,9,ii)  =  I  n,^(9)j|^(t)f(S|^)  exp(S|^t)u(t-tQ)  (69) 

+  other  SEM  terms  +  noise 

Note  that  the  parameters  in  (68)  are  now  simplified  to  one-dimensional 
case.  Unlike  the  complicated  aircraft  case,  the  SEM  parameters  for  a 
thin  wire  have  been  determined  by  theoretical  methods  based  on  Maxwell's 
equations  [3].  The  normalized  natural  mode,  which  describes  the  spatial 
amplitude  variation  in  the  induced  current,  is  given  approximately  by  a 
simple  equation 

j|^(t)  =  sin  (knt/L). 


There  is  no  simple  equation  for  the  coupling  coefficients.  But  from 
Figure  6,  it  is  easy  to  see,  for  example,  that  n|^(90°)  =  0  for  all  k 
and  that  n2(0“)  -  0»  etc. 

Hereafter  we  consider  an  impulse  as  an  exciting  function  so  that 
f(S|^)  =  1.  We  also  assume  that  the  other  SEM  terms  are  zero,  and  that 
the  noise  is  additive  Gaussian  white  noise.  In  general  it  is  hard  to 
estimate  natural  frequencies,  coupling  coefficients,  and  natural  modes 
at  a  same  time.  Our  strategy  is  first  to  get  an  estimate  of  poles  using 
multiple  data  sets.  We  discuss  this  problem  in  the  present  chapter.  In 
the  next  chapter  we  show  how  to  estimate  the  other  parameters. 

6.1  POLE  ESTIMATION  USING  MULTIPLE  OBSERVATIONS 

In  this  section,  we  present  an  algorithm  for  pole  estimation  given 
multiple  observations.  Equation  (68)  shows  that  measurements  made  at 
different  locations  or  with  different  directions  of  incidence  have  the 
same  poles.  We  can  use  multiple  data  sets  efficiently  to  get  an 
improvement  in  the  estimate  of  natural  frequencies.  A  currently  exist¬ 
ing  simple  technique  for  using  multiple  data  sets  is  simply  to  compute 
poles  for  each  data  set  and  then  average  the  results  [20].  This  is  not 
the  best  approach  as  we  argue  below. 

Let  the  vector  y(^»'')  be  the  observation  vector  at  each  location 
l  {i  =  1,  2,  ...,  L)  for  each  incidence  direction  9^  (i  =  1,  2,...,I). 
There  are  M  =  LI  observations  available.  Without  considering  incidence 
directions  and  observation  locations,  any  observation  vector  for  m  =  1, 
2,  ...,  M  can  be  represented  as 


is  zero-mean,  uncorrelated  noise.  Because  natural  frequen- 


where  e„ 
n 

cies  do  not  depend  on  m,  i.e.,  each  data  set  shares  the  same  transfer 
function  denominator  coefficients  but  has  different  numerator  coeffic¬ 
ients,  we  set  up  an  equation  as  follows. 


,(m) 


=  FY, 


'”>x,  t  Fy<”> 


Fl> 


(m) 


(71) 


for  m  =  1,  2,  ...,  M 

where  are  noise  vectors  and  F,  are  defined 


in  (31).  Suppose  the  ncrise  variance  is  the  same  for  all  m.  By  intro¬ 
ducing  the  augmented  vector  e,  (71)  becomes 
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e  =  f  (^L  Xj  +  y)  -  ®  (72) 

Now  we  want  to  minimize  J  =  e  e.  If  the  noise  is  Gaussian  then  the  Xj 
and  a  that  minimize  J  are  the  maximum  likelihood  estimates.  Note  that  0 
is  a  sum  of  squares  of  all  the  errors.  Fixing  f  as  a  constant  matrix 
and  setting  the  derivatives  of  0  with  respect  to  Xj  and  a,  we  get  two 


conditions 


{73a) 


(73b) 


It  is  easy  to  see  that  the  matrix  is  nonsingular.  Thus,  from 

(73a)  we  get 

5  »  (^L*  *  7)  (74) 

Substituting  (74)  into  (73b)  we  obtain 

(fTj*  3  f^L  ‘<1  •  -  (^l)*  3  ^  (75) 

where  3  =  I  -  f|_* 

(75)  is  rewritten  simply  as 

I  ]*  Q  Xj  = 

'w  m 

where  Q  =  I  -  ^ 

Since  F  (and  so  Fl)  depends  on  Xj,  we  can  use  (76)  but  with  itera¬ 
tions.  It  should  be  noted  that  the  computational  burden  associated  with 
this  algorithm  is  considerably  less  than  that  with  the  averaging  scheme 
which  takes  an  average  of  poles  obtained  using  the  IPA  for  each  data 


[FY^{m)^* 


Q  Fy 


(m) 


(76) 


*  —1 

set.  The  latter  calculates  inverse  filter  coefficients,  and  at 

each  iteration,  and  the  roots  of  the  polynomials  after  obtaining  conver¬ 
gence  for  each  data  set,  but  the  former  calculates  these  just  once. 

So  far,  the  assumption  was  that  the  variance  of  the  noise  is  the 
same  for  all  m.  But  in  general,  the  noise  variance  may  be  different  for 
each  data  set.  If  the  SNR  is  known  a  priori,  it  is  possible  to  weigh 
the  data  sets  differently  for  each  m.  For  example,  the  weight  for  the 
data  set,  whose  SNR  is  relatively  smaller  than  others,  should  be  smaller 
than  other  weights.  Because  the  Equation  (76)  is  in  quadratic  form  one 
possible  form  of  weight  is 

«  =  »<"’>'»<''>/  (N»  2) 
m  m 

where  is  the  variance  of  the  noise  for  mth  data  set  and  w('")  is 
an  observation  vector  without  noise.  In  practice,  noise-free  data  sets 
are  not  available  so  that  y  can  be  substituted  for  w.  The  purpose  of 
weighing  data  sets  like  above  is  to  emphasize  the  relatively  good  data 
sets,  and  to  weaken  the  bad  data  set,  for  instance  from  a  location  at 
which  natural  mode  is  supposed  to  be  pretty  small  for  all  k. 

Considering  weight  (76)  is  rewritten  as 


(77) 


where  Q  is  as  in  (76). 


Simulation  results  in  Chapter  8  show  that  both  the  bias  and  the  sample 
variance  using  the  Equation  (76)  are  less  than  those  with  the  averaging 
scheme. 

6.2  C-R  BOUND 

In  this  section  we  calculate  C-R  bound  for  the  characteristic 
equation  coefficients,  for  which  sample  variances  can  be  compared,  given 
multiple  observations.  Suppose  are  real  numbers.  Using  the 

results  obtained  in  Chapter  4,  it  is  easy  to  show  that  for  Gaussian 
noise  the  conditional  density  function  of  any  y('’’)  is  rewritten  as 

p(y^"'^|xj)  =  (78) 

(2n)-N/2^detv("'V^^exp[-  W 

where  ^ 


Thus  the  conditional  density  function  of  the  augmented  vector  y  becomes 


and  the  derivative  of  the  log  of  this  is 


log  p(y|xj)  =  -  i? J  iT  “^(y  +  . 

The  second  derivative  of  the  log  of  (79)  is  given  by 


a 

axi 


[  ^  log  p(yixj)]  ^ 


-1 


Therefore,  C-R  bound  for  the  coefficients  of  the  characteristic  equation 
is  given  by 

"r  (S,  - »,)  i [[  r  “V'”’  “l'"’]'*]  h  (80) 

where  9i  =  b^.i+i  for  i  =  l,  2,  ....  k. 

If  the  noise  variance  is  the  same  for  all  m,  then  (84)  becomes 
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CHAPTER  7 

ESTIMATION  OF  COUPLING  COEFFICIENTS  AND  NATURAL  MODES 


In  this  chapter  an  algorithm  for  estimating  coupling  coefficients, 
natural  modes,  and  normalization  factors  given  natural  frequencies  is 
presented. 

The  induced  current  data  on  the  scatterer  for  any  incidence  direc¬ 
tion  0.  and  observation  location  i  is  represented  as  (from  (69)) 

(i.t)  _  r  „  (max)  „  (i)  i  (i)  ,  n  (Ui)  /oo 

^n  -  \  \  h  ^  ®n 


for  n  =  0,  1,  ....  N-1 

where  is  the  normalization  factor,  is  the  normalized  coup 

(Si)  fi  tl 

ling  coefficients,  j.  '  '  is  the  normalized  natural  mode,  and  e  ^  .  is 
•  K  n 

the  noise  which  is  assumed  to  be  zero-mean  and  uncorrelated.  Note  that 


(n,.^^^)  =  1,  and  max  =  1 


for  each  k.  Let 


(max) 


47 


(85) 


^  ; 


[  I  (ZJ^^^)*ZJ^^^]  =  [  I 

I  I 

^ OP  i  *  I9  2j  •••}  I  • 

-►*->•  ( z) 

Similarly  using  the  Equation  (83b)  e  e  is  minimized  with  respect  to  j ' 

by  solving  following  equation 

[  I  (Zg(^^)*Z6^^'^  ]  =  [  I  (ZG^''h*y^’*^h  ]  (86) 

1  i 

^or  Jl“  1}  2j  •••)  L  • 

Let  matrices 


G- 

.... 

9<'>]  , 

J  = 

i(2) 

J  >  •  *  •  > 

q(2) 

y  $  •  •  • » 

,  and 

j  = 

;(2) 

J  9  •  •  •  * 

i'^’l  . 

from  (83a) 

and  (83b), 

G  =  HG  and 

(87a) 

A 

J  =  HJ. 

(87b) 

Note  that  the  normalized  coupling  coefficient  (natural  mode)  is  obtained 

A  A 

by  normalizing  the  rows  of  G  (J)  so  that  the  maximum  element  of  each  row 
of  G  (J)  is  1. 
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Now  the  algorithm  is  as  follows. 

1)  Set  as  the  identity  matrix,  solve  (86),  and  normalize 

the  result  to  obtain  an  initial  estimate  of 

2)  Set  using  the  preceding  estimate  of  and  solve  the 

Equation  (85)  to  obtain  (g^’^). 

A  A 

3)  Form  G  and  normalize  every  row  of  G  to  obtain  G  (i.e.,  g'  ') 
and  normalization  factors 

4)  Set  (G^^^)  using  the  preceding  estimate  of  and  solve  the 

A  /  n  \ 

Equation  (86)  to  obtain  (j^ 

A  A 

5)  Form  0  and  normalize  every  row  of  J  to  obtain  J  (i.e,, 
and  the  normalization  factors 

6)  If  convergence  is  obtained  then  stop.  If  not  then  go  to  2)  and 

« 

conti nue. 

If  convergence  is  obtained,  then  e  e  given  (Zj^)  Is  minimized.  The 
normalization  factors  obtained  both  in  procedure  3)  and  5)  may  be 
averaged  to  obtain  a  better  estimate. 


CHAPTER  8 


SIMULATION  RESULTS 


Experiments  I  and  II  in  this  chapter  are  to  support  the  algorithms 
proposed  in  Chapters  6  and  7  respectively. 

8.1  EXPERIMENT  I 


Multiple  data  sets  which  share  the  same  poles  but  have  different 
residues  were  considered  in  this  experiment.  The  system  order  was 
chosen  to  be  K  =  2.  Data  sets  consisting  of  a  pair  of  complex  conjugate 
damped  sinusoids  in  white  noise  were  used.  The  data  set  for  any  m  (m  = 
1,  ....  M)  is  represented  as 


y(m)  ,  (m)  n  -(m)  -  n  (m) 


Number  of  data  points  N  -was  20  and  the  standard  deviation  of  the  noise 
was  0.1  for  all  m. 

For  the  first  example,  four  data  sets  were  used.  The  pole  Zj 
was  -0.7  +  jO.5  so  that  the  characteristic  equation 

B^(z)  =  1  +  1.4z‘^  +  0.74z‘^ 


The  residues  were:  =  1,  =  0.3  +  jO.4, 

Cj^^^  =  -0.4  +  jO.2,  Cj^^^  =  0.7  +  jO.7. 

Method  I  is  the  averaging  scheme  which  takes  the  average  of  coefficients 
obtained  using  the  IPA  for  each  data  set  separately.  Method  II  is  the 
algorithm  presented  in  Chapter  6.  The  C-R  bound  for  the  coefficients 
was  calculated  using  noise  free  data.  The  variances  with  respect  to  the 
true  coefficients  were  calculated  after  100  trials  for  each  method. 
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Table  2(a)  shows  that  both  the  bias  and  the  sample  variance  with  method 
II  were  about  the  half  of  those  with  method  I.  Note  that  sample 
variances  with  method  II  were  very  close  to  the  C-R  bounds. 

For  the  second  example,  6  data  sets  were  used  and  zi  =  -0.8  + 
jO.4  so  that 

B^lz)  =  1  +  1.6z"^  +  0.8z“^  . 


E 


The  residues  were:  =  U  ~  0*3  jO.5 

c^^^^  =  -0.5  +  jO.2,  c^^^^  =  0.8  +  jO.3, 

c^^^^  =  -0.7  +  jO.7,  c^^®^  =  0.6  +  jO.2. 

Table  2(b)  shows  that  method  II  outperformed  method  I. 

For  the  third  example,  10  data  sets  were  used  and  z^  =  0.75  + 
jO.48  so  that 

B3(z)  =  1  +  1.5z"^  +  0.7929z"^  . 

The  residues  were:  =  1» 

Cj^^^  =  0.7  +  jO.4,  =  -0.5  +  jO.3, 

Cj^^^  =  -0.4  +  j0.3,.Cj^®^  =  0.2  +  jO.5, 

Cj^^^  =  0.3  +  jO.3,  Cj^®^  =  -0.2  +  jO.3, 


Table  2.  Companson  of  Pole  Estimators 


bias: 


I 


bi  as ; 


I 


A 

var  (b^-b.) 

I 

II 

C-R  Bound 

3.85x10-** 

1.67x10-** 

1.79x10-** 

2.67x10-** 

1.08x10-** 

_ 

1.12x10-** 

U) 

var  (b^-b^ ) 

I 

II 

C-R  Bound 

1.01x10-** 

6.16x10-5 

5.54x10-5 

7.74x10-^ 

4.99x10-^ 

4.42x10-5 

(b) 

V( 

>r  (b^-b. ) 

I 

II 

C-R  Bound 

8.53x10-^ 

5.22x10-^ 

3.94x10-5 

8.11x10-^ 

3.95x10-^ 

3.38x10-5 
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Again  Table  2(c)  shows  that  method  II  was  a  better  estimator  than  method 
I.  Especially  the  bias  for  b^  with  method  II  is  about  the  one-fifth 
of  that  with  method  I. 

From  all  three  examples,  sample  variances  were  pretty  close  to  the 
C-R  bounds. 

8.2  EXPERIMENT  II 

In  this  experiment  we  used  25  data  sets  from  5  observation  loca¬ 
tions  for  5  incidence  directions.  The  SEM  parameter  values  are  repre¬ 
sentative  of  those  for  a  thin  wire  [19],  Any  data  set  for  i  =  1 . 5 

and  i  =  1,  ....  5  is  given  by 

w  (^‘^5  =  t  (i)-.(i)7  n  -  (max)-  (i)^  {l)-  n.  .(i.A) 

"  k=l  ^  ^  \  ^  \  \  \  \  ^  ® 

for  n  =  0,  1,  ...»  49 

( i  z  1 

where  e'  ’  '  is  an  uncorrelated  zero-mean  noise  vector  with  standard 
deviation  a. 

Three  complex  conjugate  z-pole  pairs  (s-poles  in  [20]  were 
translated)  were: 

Zj  *  0.5589  +  jO.7325,  z^  =  -0.2831  +  jO.8396, 
and  z^  *  -0.8320  +  jO.2237 

Normalization  factors  were: 

^^(max)  ^  .7^0728  -  jl.9371,  =  1.4320  -  j4.6624. 


and  *  3.9331  +  jl.2755 


k  =  2  0.02  0.76+j0.01  1.0  0.92-j0.021  0.48-j0.028 

k  =  3  -0.74-j0.018  j0.02  0.76+j0.025  1.0  0.60-j0.03 


where  0^  =  0°,  02  =  20°,  0^  =  35°,  0^  =  48°,  and  0^  =  70°. 

Natural  modes  were  taken  as: 


1*1 

1*2 

1=3 

1=4 

1=5 

k  =  1 

0.52-j0.01 

0.71-j0.007 

1.0 

0.71-j0.007 

0.52-j0.01 

k  =  2 

0.088-j0.014 

1.0 

0.02 

-1.0 

-0.88+j0.014 

k  =  3 

1.0 

0.71+j0.032 

1.0 

0.72+j0.032 

1.0 

where 

five  locations 

on  the  thin  wire 

were  at 

0.18  m,  0.245 

y  0  •  5  ^  t 

0.755  m,  and  0.82  m  (the  length  of  the  wire  was  1  m). 

For  the  first  example,  a  -  0.5.  Exact  values  of  z-poles  were 
given.  Each  parameter  converged  to  within  an  order  of  10"®  at  5th  iter¬ 
ation.  The  sum  of  the  absolute  values  of  the  estimated  parameter  errors 
is  shown  in  Table  3  as  a  measure  of  fit. 
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We  intentionally  omit  the  results  of  some  iterations  because  there  were 
no  big  changes  in  the  errors  after  the  2nd  iteration.  Table  3  shows 
that  the  errors  in  the  natural  modes  decreased  significantly  at  the  1st 
iteration.  But  after  the  1st  iteration  the  errors  seem  to  remain  almost 
the  same. 


Table  3.  Sum  of  Errors  at  Each  Iteration  (a  =  0.5) 


init.  est 


1.43337 


1st  iter. 
2nd  iter. 


0.69081  0.71039  1.36592 

0.62938  0.72258  1.30901 


6th  iter 


0.62042 


0.72486 


1.30044 


CHAPTER  9 


CONCLUSIONS 

In  this  dissertation  we  have  introduced  two  new  algorithms  for 
estimating  poles  given  noise-contaminated  impulse  response  data.  One  is 
an  algorithm  to  extract  the  reduced  characteristic  equation  from  the 
weakest  eigenvectors  when  the  system  order  is  overdetermined.  This 
algorithm  is  closely  '■elated  to  Henderson's  deflation  algorithm  [16]  and 
simulation  results  show  that  new  method  performed  better  than  the  exist¬ 
ing  method.  The  other  algorithm  is  the  iterative  preprocessing  algor¬ 
ithm  (IPA)  which  is  related  to  the  Steiglitz-McBride  method  [6,7]  and  to 
the  Evans-Fischl  technique  [18]  but  has  an  advantage  over  either  in  both 
stability  and  computational  complexity.  The  approximate  IPA  (AIPA) 
which  reduces  the  computational  burden  further  was  described.  The  AIPA 
for  the  pure  sinusoid  case  is  related  to  Kay's  IFA  [7]. 

Also  the  C-R  bound  for  the  estimated  characteristic  equation  coef¬ 
ficients,  which  is  a  valuable  tool  for  evaluating  different  estimators 
without  finding  roots  of  the  equation  was  evaluated. 

The  IPA  was  extended  to  SEM  parameter  estimation.  Using  the  IPA, 
it  is  possible  to  process  multiple  data  sets  at  the  same  time  to  get  an 
improvement  in  pole  estimation.  Finally,  an  iterative  scheme  to  esti¬ 
mate  coupling  coefficients,  and  natural  modes  was  introduced.  Simula¬ 
tion  results  indicate  that  the  estimation  errors  decrease  most  at  the 
first  iteration  after  the  initial  estimate.  Together  these  results  pro¬ 
vide  a  way  to  estimate  the  SEM  description  of  a  scatterer  from  multiple 
data  sets  taken  at  different  locations  and  with  different  directions  of 
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incidence.  For  large  to  moderate  signal-to-noise  ratios  these  estimates 
meet  the  C-R  bound  and  thus  have  the  minimum  variance  of  any  unbiased 
estimators. 

So  far  the  theoretical  convergence  and  accuracy  properties  of  the 

IPA  are  not  known.  The  IPA  assumes  the  system  order  is  given.  The  next 

phase  of  research  should  be  to  investigate  the  convergence  property  of 

the  IPA,  and  to  devise  an  algorithm  combined  with  order  selection. 

Another  interesting  problem  would  be  to  improve  the  AIPA.  As  discussed 

in  Chapter  3,  the  AIPA  approximates  (Fj^  would  be  useful  to 

*  * 

have  an  approximation  for  F^^[F|^  F^^J  F^^  instead. 
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APPENDIX  A 

The  data  matrix  W  and  the  matrix  G  are  as  in  (13)  and  (15) 
respect i vely. 

Th.  Every  column  g^,  of  G  is  in  the  null  space  of  W. 


(proof)  Let  be  the  jth  row  of  the  matrix  W.  If  =  0 

for  all  j  =  1,  ...,  N-M  and  m  =  1,  ...,  M-K+1,  then  the  proof  is  done. 

M  T  M  T 

Let  =  [1,  z,  ...,  z  ]  and  u^  =  [1,  z^,  ....  z.  ]  where  (z^)  are  system 

K  T  K  T 

poles.  And  let  v  =  [1,  z,  ....  z  ]  ,  v^  =  [1,  z^ ,  ...,  z^  j  and  x  = 
fb  ,  bj.  b  ]^.  Then  g_^u.  =  (x^V . )  =  0  for  all  m,i. 

TN  N*a  0  Hi  1  1  I 


Now 


[,ij  . ,|j  1 


j+M-1 


It'-.- 


1  l  I  cz->-‘U^) .  I  1 

i=l  ’  ’  i=l  ’  ^  i*l  ^  ^ 


J-1 


3-h,  M, 


I  6^u.^  (6^  =  c^z^'h 

i=l  ^  ^  1  ^ 


r. . 

k: 


Thus, 


w.g„  =  g„^w.^ 
j  ^m  ^m  j 


T 

g  y  a.u. 
^m  n  1 


J^s,g/u,  .  0 


Q.E.D. 


.V, 
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APPENDIX  B 


Here  we  want  to  show  that  the  Equation  (36)  is  identical  to  the 
Equation  (40). 


By  partitioning,  F  and  Y  become 


=  [Fl  i  F,] 

K  N-L 


’•^0  ' 

F^  = 

0 

1^1  J 

3  N-K 

R 

.'2. 

^N-K 


(B-1) 


Jo. 

-< 

0 

r“ 

j  ^OR 

Y 

3  N-K 

Iv 

1  yR 

K  1 


Using  (B-1),  (36)  becomes 

(B-2)  (FJol  ^  F,YJ*  Q  (FJol  .  F.Y^) 

where  Q  =  I  -  Fl(F|_*Fl)"^Fj_* 


Equation  (B-2)  is  simplified  to 
(B-3)  (Vl)*!'  -  *I 

•  -  -  fLffL*fL)''F  *]  F„y„ 


From  (B-1),  Fj^  F|^  *  F^  and  Fj^  Fj^  ~  ^2  ^2  ’ 


Thus,  {B-3)  is  rewritten  as 


(B-4)  ‘i 

=  -  (F2\)*[I  -  FitFi.*fLr^fi*l  Vr 

Equation  (40)  is 

(B-5)  Yl*(DD*)'^Yl  Xj  =  -  YL*(00*)-^y^ 

P 

If  (DD*)"^=  F2*[I  -  Fi(FL*FLr^''i*]  ''2* 
cal  and  the  proof  is  done. 


Tlu  (DD*)‘^  =  F2*[I  -  FiCFl^Fl^'^'"!*^  ''z 
(proof)  By  partitioning,  B  becomes 


D  =  [D,  lOj] 


It  is  easily  shown  that 
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APPENDIX  C 

Let  F(N)  be  an  NxN  matrix  as  defined  in  Chapter  3.  In  this 
appendix,  we  want  to  describe  what  happens  to  Fl(N)*Fl{N)  for 
large  N.  Note  that  notations  are  slightly  changed  in  this  appendix. 
Consider  the  infinite  matrices 


K7. 


A'.' 


B  = 


Jo. 

I  0 

}< 

*-«v 

F  =  B"^  = 

•  . 

*^0  ! 

1 

0 

«3 

S 

[‘bV; 

^  ! 

L  * 

W-'  w/ 

K 


B, 


It  is  easy  to  show  that 

fo  ■ 


K' 


i 

K 

ti 


(C-1)  F^  =  82 


-1 


F^  =  -  F2B2FQ 


For  convenience,  assume  matrices  B  and  F  are  real.  But  the  result  can 

o» 

be  extended  to  complex  case  immediately.  Also  assume  ^  f.f.  .  <  » 

j=i  ^ 

for  i  =  0,  1,  ...,  K. 


It  is  obvious  that 


R  =  lim  fJ(N)  Fl(N)  =  Fq^Fq  +  fJf^ 
N 


t  - 


•V.' 


R  can  be  interpreted  as  an  autocorrelation  matrix  of  the  inverse  filter 
coefficients  such  that 


(C-2)  R  = 


'■q 

r 


K-1 


1  ^0 


r, 


K-2 


^K-1  ''k-2 


where  r, 


and 


(C-3)  R  Xj  =  -  r  where  r  =  [r^,  r^,  ....  rj^]^  . 

Because  we  consider  infinite  matrices,  the  following  holds  (by  C-1) 


Fi  =  -  F2B3Fg  =  - 


'fo 

0 

'^1 

=  - 

0 

e  t 

If 

so  that 


(C-4)  fJFj  -  F„^bJ  R  BjFj 


Now  we  want  to  show  that  R’^  = 


The  R"^  =  BgBg^  -  B Jb^ 


(proof)  Denote 


r  0  1 1 


PremuUiplying  (postmuUiplying)  a  matrix  by  J  results  in  swapping  all 
the  rows  (columns)  of  the  original  matrix.  Then  one  can  show  that 
(using  C-3) 

B^RBqJ 

is  always  symmetric.  Thus, 

(C-5)  Bj^^(Rj^RBqJ)B^  is  symmetric. 

It  can  be  shown  easily  that 
(C-6a)  JB^  =  Bj^^J  and  that 

(C-6b)  is  symmetric. 

From  (C-5), 

(C-7)  bJBjRBqJBj  =  Bj^BjRBqBJj  by  (C-6a) 

=  (BqBj^^J )^RB^^Bj  by  symmetry 

=  BqBJjRBj^Bj  by  (C-6b) 

Because  B^  is  nonsingular  for  bj^  *  0,  postmultiplying  on  both  sides 
of  (C-7)  yields 

bJbjRBqJ  =  BqbJorbJ 

b^^BjRBq  =  BqbJjrbJj 

*  BgBj(JRJ)B^ 


by  (C-6a) 


APPENDIX  D 


In  this  appendix  we  want  to  show  that 


-1 


(D-1)  lim  [Fl(N)  Fl(N)]  =  0 


when  the  roots  of  the  characteristic  Equation  B(z)  =  0  are  on  the  unit 
circle.  Suppose  the  roots  are  distinct,  then  each  fp  can  be  rewritten 
as 

K  K 

(D-2)  a-  exp  (je^.n) 

where  0  £  9^  <  2n 


It  is  easily  shown  that 


(0-3)  11ml  exp[J( V®k>"l 


n=0 


Using  (0-3),  it  is  easy  to  show  that 


(D-4)  litn  " 


A  A 


'*0 


0  1 


hermi  • 
tian 


1  if  m  *  k 
LO  if  m  ^  k 


k-1 


r„  r,  •••  r 


k-2 


•  • 


z 

where  r_  =  7  exp(-jni0. )  laJ  ,  for  m  =  0,  1,  K-1. 

Ill  j  1  I 


'■  1 1* *■  '•■  '■  L'. L'. '.■-'’■i I'M 


i 


V 


I 


f:- 


E-;-; 

L-- 


W*. 

V 

•• 

w* 
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Let  Q  = 


exp[ je^] 
exp[ j20^] 


exp[ j02] 
exp[j202] 


...  exp[j0^] 
...  exp[j20,^] 


exp[  j(t(-l)0  J  exp[  j(K-l)02]  ....  exp[  j(K-l)0|^] 


and 


A  =  diag  [la^|^  ,  ,  ....  then 

(D-4)  is  rewritten  as 

R  =  OAQ*  . 

Because  0  is  nonsingular  and  A  is  positive  definite  and  diagonal,  it  is 

A 

shown  easily  that  R  is  positive  definite  and  invertible. 

Let  R(N)  =  i  F^(N)*Fj_(N)  . 

* 

As  mentioned  in  Chapter  3,  F^(N)  Fj^(N)  is  nonsingular  for  each  N  and  so 
is  R(N). 

Fl(N)*Fl(N)  =  NR(N) 

[Fl(N)*Fl(N)]  ^  R(N)"^ 

It  is  shown  [7:Ch.4]  that  if  A  is  nonsingular  and  lim  A(k)  =  A  then  for 
all  sufficiently  large  k,  A(k)  is  nonsingular  and  lim  A(k)"l  «  A-1. 

A  A  A 

Because  lim  R(N)  *  R  and  R  is  nonsingular, 

11m  [F, (N)*F,  (N)]  -  Urn  i  R  =  0  . 

N-*«  N-h»  ^ 


m 


■-•.04 

r 


